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Abstract 


The Turing factorization is a generalization of the stan- 
dard LU factoring of a square matrix. Among other ad- 
vantages, it allows us to meet demands that arise in a 
symbolic context. For a rectangular matrix A, the gen- 
eralized factors are written PA = LDUR, where R is 
the row-echelon form of A. For matrices with symbolic 
entries, the LDU R factoring is superior to the standard 
reduction to row-echelon form, because special case infor- 
mation can be recorded in a natural way. Special interest 
attaches to the continuity properties of the factors, and 
it is shown that conditions for discontinuous behaviour 
can be given using the factor D. We show that this is 
important, for example, in computing the Moore-Penrose 
inverse of a matrix containing symbolic entries. 

We also give a separate generalization of LU factoring 
to fraction-free Gaussian elimination. 


1 Introduction 


It is now fifty years since Alan M. Turing wrote the sem- 
inal paper [16], in which he introduced several ideas that 
have since had a profound effect on numerical analysis, 
through their popularization and completion by Wilkin- 
son [8]. One of the simplest, and yet most far-reaching, 
ideas was his recognition that direct methods for solving 
linear systems Ag = b can be written as matrix factor- 
izations, and in particular Turing proved that, once the 
‘partial pivoting’ strategy is chosen, every square nonsin- 
gular matrix A can be written uniquely (up to ties for 
the choice of pivots) as the following ‘resolution into the 
product of matrices’ 


PA=LDU, (1) 


where P is a permutation matrix, L is unit lower triangu- 
lar with entries of magnitude bounded by 1, D is diagonal 
with nonzero entries, and U is unit upper triangular. 
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There are several ‘standard’ variants of LU factoriza- 
tion for square matrices; the main two group the LD fac- 
tors together or instead group the DU factors together. 
Turing’s choice to separate out the diagonal entries did 
not, unfortunately, become ‘the’ standard, which is a pity 
because it has some advantages, such as greater symmetry 
if A is symmetric. Incidentally, his use of the word ‘res- 
olution’ where a modern author might use ‘factorization’ 
or ‘decomposition’ interchangeably, is another aspect of 
his paper that has been forgotten, but in this case we 
doubt whether any harm has been done. We shall use 
‘factorization’ or ‘factoring’, in the main. 

In this paper we generalize Turing’s factorization to the 
case of rectangular matrices, and show that this helps to 
deal with the so-called specialization problem!, which we 
now briefly discuss in the context of linear algebra. 

Some of the standard operations of linear algebra, in 
particular the reduction to row-echelon form, present dif- 
ficulties when implemented in computer algebra systems, 
because continuous input can lead to discontinuous out- 
put. We give an example below in which discontinuous 
behaviour is missed by a computer algebra system. 

We emphasize that it is discontinuous matrix properties 
that are the ultimate source of the difficulty. Properties 
such as rank or nonsingularity are sometimes discontinu- 
ous, even when the matrix is continuous, and this presents 
problems both for exact and for floating point computa- 
tion. The theory of ‘conditioning’ (initiated also by Tur- 
ing in [16]) was developed to deal with discontinuity in the 
floating-point context. Perhaps because most computer 
algebraic systems take a formally algebraic viewpoint, no 
similarly useful and satisfactory theory has yet been de- 
veloped for the case of exact computation. 

The difficulty is, however, well known in algebra sys- 
tems, and a number of approaches to it have been pro- 


1 We would be interested in printed references to this usage. The 
term ‘specialization problem’ appears to be well-known but we do 
not know a single paper, other than our own, which uses it. Fur- 
ther, there are other kinds of ‘specialization problems’ than the ones 
caused by removable discontinuities in expressions (see e.g. [15]), 
but we restrict ourselves to this type in this paper. 


posed and discussed (see e.g. [3]). Many of the proposals 
are based on appropriate modifications of the user inter- 
face to the individual systems, but for the problems stud- 
ied here, an alternative is to modify the mathematical 
setting so that special cases can be identified efficiently. 
We do this by formally replacing the notion of the ‘row 
echelon reduction’ with a factorization which preserves 
special-case information. 

We also take the opportunity to write fraction-free 
Gaussian elimination (see e.g. [5]) as a factorization. This 
offers similar advantages in the case of parameters, but is 
more useful conceptually in that we feel this is a simpler 
presentation of the fraction-free algorithm, which avoids 
determinants and Sylvester’s identity. 


2 Row Echelon Reduction 


We begin by discussing the difficulties associated with the 
traditional row-echelon reduction. This is taught to stu- 
dents as the transformation of a given matrix A into a 
succession of equivalent matrices by the use of row oper- 
ations. The transformations stop when the unique row- 
echelon form R is reached. The shortcomings of this ap- 
proach are illustrated by the following problem, adapted 
from a widely used textbook [12]. 


P.1) The matrix below is the augmented matrix for some 
system of equations: 


1 —2 3 sin £ 
1 4cosgz 3 3sin gz (2) 
—1 3 cosg— 3 cosg 


Find the values of the parameter æ for which the 
system has no, one, and infinitely many solutions. 


Every CAS has a command to obtain the reduced row- 
echelon form of a matrix. In Maple, the command is 
linalg[rref](A). In Maple V Release 3, this gave the 
error message ‘matrix entries must be rational polynomi- 
als’. In Maple V Release 4, this gave the error message 
‘unable to find a provably non-zero pivot’. 

This message shows the dichotomy between the ana- 
lytic and algebraic viewpoints. If you replace cosa with 
c and sing with s, then Maple V Release 4 will compute 
the row echelon form without a murmur, taking the al- 
gebraic viewpoint that the user is working over rational 
polynomials in indeterminates s and c. But given the in- 
formation that s was sin z and c was cos x, then the code 
recognized that the problem is analytic in nature, not 
algebraic, and refused to give a possibly wrong answer. 
This behaviour is not a very satisfactory way of handling 
the difficulty of discontinuities, and thus was changed for 
Release 5. 

In Maple V Release 5, rref(A) gives 
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2 cos(#) sin(#)—3 sin(#)—6 cos(x)—3 


1 0 0 “2eos(@)+l o o 
R(z)=|0 1 0 a 
001 2 cos(x)+2 sin(z)+1 


2 cos(#)+1 

(3) 
So now Maple takes the algebraic viewpoint that sin æ 
and cos æ are ‘indeterminates’. 

If, however, the user takes the analytic viewpoint, diffi- 
culties arise. By inspection it is clear that 2cosg + 1 = 0 
is a special case. For illustrative purposes, only one root 
of 1 + 2cosa = 0 need be considered, so substituting 
z = 27/3 into (2) and re-issuing the command rref gives 


10 2 0 
R(27/3)=| 0 1 -1/2 0 (4) 
00 0 1 


Inspection of R(x) in (3), however, gives no clue that 
there is a second special case, æ = 0, for which the row- 
echelon form is 


103 3 
R@=|0 101 (5) 
0000 


Identifying the appropriate value 6, which we will need in 
a later section, is left as an exercise for the reader. Note 
that this last row echelon form is necessary to answer the 
last part of the original textbook question. 

Turing’s idea of resolution of matrices into factors is 
now a mainstay of the modern view of linear algebra, 
with most of the important constructs being expressed 
this way (e.g. the FFT [17]). Row reduction stands out 
as an exception in being a transformation. Therefore, po- 
tentially zero divisors can be seen only by inspecting the 
intermediate working. The central difficulty, however, is 
the normalization required by row-echelon form in order 
to make it unique; this is a key property because many 
proofs in linear algebra rely on it. One way to tackle the 
problem would be to change the definition of row-echelon 
form to remove the need to normalize to 1, for example, 
by changing to the Hermite normal form for matrices of 
integers or univariate polynomials (see e.g. [11, p. 15]). 
However, the Hermite normal form is not defined for the 
general algebraic objects we wish to consider, such as our 
example above. In any event, the same difficulty of ‘los- 
ing’ a division by a parameter can be constructed for this 
normal form by considering polynomials with parameters 
in the coefficients. 

By generalizing LU factoring to include the row- 
echelon form, we can resolve the difficulties described 


above. Any terms that would be divisors in a reduction 
to row-echelon form will appear somewhere in the fac- 
tors, and therefore a computer system does not have to 
watch during the reduction to identify special cases for the 
user. Moreover, all information about the original matrix 
is preserved in a convenient form for the investigation of 
special cases. 


3 The Turing factorization 


Theorem 1: If A is an m x n matrix over C, there 
exists a permutation matrix P, a unit lower triangular 
matrix L, a nonsingular diagonal matrix D, and a unit 


upper triangular matrix U such that 


PA=LDUR, (6) 
where R is the (unique) row-echelon form of A. Though 
it is not very different from standard LU factoring of 
square matrices, we feel that the differences are important 
enough to give equation (6) a name. We call this the 
Turing factorization of A. 

Proof: Noble and Daniel [12] prove that there exists a 
nonsingular matrix F such that 


A=FR. 
Apply Turing’s LDU resolution [6, 16] to F, so we have 
PF=LDU. 


Note that since F is nonsingular, D is nonsingular. The 
Turing factorization (6) follows. Y 

We remark that if A is invertible then R = J. In this 
case, if we combine the factors D and U, then (one kind 
of) standard LU decomposition is obtained. If instead we 
combine the factors L and D, we get another standard. In 
the first combination, the nonsingularity requirement for 
D becomes a nonsingularity requirement for Uj; = DU 
and this is what is implemented in the Maple routine 
LUdecomp”. It is worth noting that LU factoring requires 
no more computation than row-echelon reduction itself; 
all that is required is that the steps in the calculation be 
stored. 

Our main theorem addresses the continuity of the row- 
echelon form. To motivate it, we return to the example 
problem and examine the row-echelon forms (2)-(4). If 
we take any point z = a which is removed from the special 
cases, we clearly have 


lim R(x) = R(a) . 


wa 
2 The routine LUdecomp was written by Michael Buckley, and was 
based in part on the program RowEchelon [4] from the share library. 
The paper [4] was never published in a journal, owing in part to the 
untimely death of Paddy Nerenberg. This present paper updates 
and corrects [4]. 
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If, however, a = 0 = 7/2 (this is the answer to our earlier 
exercise for the reader) we find 


1 0 0 

1 0 1 A 
0 1 3 
which clearly does not equal R(7/2), which is given in 
equation (5). Thus the row-echelon form is discontinuous 
at this special case. Moreover, this is the essential fea- 
ture we are looking for. Because the leading elements of 
each row have been normalized to 1, any change in the 
qualitative nature of the system, such as its rank, will be 
equivalent to a discontinuity. Finally, we notice that the 
original matrix is continuous at z = 7/2. We thus need 
a theorem characterizing when the row-echelon form of a 
matrix is discontinuous, even though the matrix itself is 
continuous. 


Definition. A matrix A(x) is continuous at a = a if each 
of its elements is continuous at x = a. 


Theorem 2: Let A(x) be a matrix depending upon one 
or more variables or parameters x, and let A be continu- 
ous at a point z = a. For any fixed g let A(x) have the 
Turing factorization P(£) A(x) = L(«)D(#)U(«)R(a). If 
det D(x) # 0 in some neighbourhood of a = a, then R(x), 
D(a), D(x), U(x) are all continuous at z = a and more- 
over P(x) may be taken constant in a neighbourhood of 
r= a: 

Proof. We may choose P(x) so that all denomina- 
tors of L(x) introduced by the factorization—that is, not 
present already in A(a#)—are factors of det D(x). This is 
a consequence of the construction of P, L, D, and U by 
Gaussian elimination of F (x) with partial pivoting. 

Since det D(x) # 0 in a neighbourhood of x = a 
by hypothesis, D7! exists and is continuous in a (pos- 
sibly smaller) neighbourhood of x = a because D7! 
is made up of rational polynomial combinations of el- 
ements of A(x), which are continuous at « = a and 
hence in some neighbourhood of x = a. Similarly since 
det L(x) = det U(x) = 1, we have that L7! and UT! exist 
and are continuous in a neighbourhood of z = a. 

Now P(x) may be taken constant in a neighbourhood 
of x = a because the pivots that arise in Gaussian elimi- 
nation with partial pivoting can have only a finite num- 
ber of zeros in a small enough neighbourhood of £ = a, 
because otherwise by continuity of A(x) we would have 
det D(a) = 0 which contradicts the hypothesis. 

Therefore we have 


R(«) =U ‘(a)D (a) L ‘(a)P 'A(a) 


and everything on the right hand side is continuous at 
paS: h. 
Corollary 1: The only places where R can be discon- 
tinuous are those where det D = 0. 
We now illustrate some of the theorem’s consequences. 


3.1 Examples 
3.1.1 Example 1. 


Parameterized matrices arise naturally in eigenvalue 
problems, such as the following. The example is chosen to 
illustrate that, even if an individual element of D is zero, 
it is only when det D = 0 that discontinuous changes in 
the row echelon form may occur. 

For ease of typography in this two-column format we 
combine D and U and use U = DU. 


Consider 
2 0 0 
B= 2 3e l 
—4 —2 0 


To find its eigenvalues and eigenvectors, we examine the 
row-echelon decomposition of A(A) = B — AI. We have 
TA(A) = 


1 0 0 2-A 0 0 

sx 1 0 0 3-A E 
A27—3A+2 

E 3 1 0 0 3-2 


In the case A = 2, the first diagonal element of U is zero, 
and when A = 3 the second diagonal element is zero. 
However, to find eigenvalues, we must test det A(A) = 
det U, = (1—A)(2—A)?. Thus A = 3 is not an eigenvalue 
and not a case in which the row-echelon form changes; 
instead, when A = 3, a row exchange could be intro- 
duced when calculating the decomposition to obtain the 
same row-echelon form as above, but with non-zero diag- 
onal entries in U1. For each distinct eigenvalue we must 
re-factorize. Further, the row echelon form allows im- 
mediate calculation of the desired eigenvectors, because 
(A — AI)z = 0 if and only if Rx = 0, since P, L, and UV; 
are nonsingular. For À = 2 the row-echelon form of A(A) 
is 


1 1/2 1/2 
R2)={0 0 0 
0 0 0 


showing that there are two linearly independent eigenvec- 
tors for this eigenvalue. The case A = 1 is treated sim- 
ilarly. This example was constructed using the method 


of [13]. 


3.1.2 Example 2. 


This example shows that even when det U1 = 0 the row- 
echelon form might still be continuous. Consider 


1 a 

I-A(a,b,c)= |1 6 

le 
1 0 0 1 a 0 1 0 
={1 1 0 0 b-a 0 0 1 
1 $= 1 0 0 1 0 0 
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Clearly det U1 = 0 when b = a. When, however, we 
recalculate the decomposition, we find P234 (a, a, e) = 


1 0 0 1 a 0 1 0 
1 1 0 0 c-a 0 0 1d], 
1 0 1 0 0 1 0 0 
where 
1 0 0 
Pyuz3= [0 0 1 
0 1 0 


The row-echelon form is unchanged unless also c = a, in 
which case we get 


1 0 0 1 
0 1 0 0 
0 0 1 0 


O DORA 


This example shows that our theorem is as strong as pos- 
sible, in that parameter values rendering U; singular can 
only be said to “need further investigation” . 


4 The Moore-Penrose Inverse 


Different methods for computing generalized inverses are 
discussed in [10]. We begin our treatment by following [6]. 
Given the SVD of the m by n matrix A, namely 


A=U5V*, (7) 


with © = diag(o1, 02,...,0,,0,...,0) we may define the 


n by m pseudo-inverse At by 
At = Vy (8) 
where Et = diag(1/oi,1/o0,...,1/o,,0,...,0). 
rank(A) = n, then At = (A? A)~1A?, while if m 
n =rank(A) then At = Av}. 
The pseudo-inverse satisfies (and is in fact defined by) 
the four Moore-Penrose conditions: 


AAtA = A (9) 
AtAAt = At (10) 
(AAT)* = AAt (11) 
(At A)* AtA (12) 


The definition (8) is very useful for computing the 
Moore-Penrose inverse of a matrix A containing numeri- 
cal entries. But it is of occasional interest to compute At 
for a matrix containing symbolic entries, and the compu- 
tation of explicit formulas for the entries of the SVD of a 
symbolic matrix is usually impossible. 


The following formula, which is based on a regulariza- 
tion procedure associated with the name of Tihonov [7], 
is more suited to symbolic computation (although as we 
will see it has severe limitations of its own): 

At = lim A*(AA* +tI)7'. (13) 
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This formula follows easily from (8), because the right 
hand side of (13) may be written 


VEU* (Ue ey (14) 
and from here it becomes 
Vdiag (01 /(07 +t), ...,07/(o2 +1),0,...,0)U*. (15) 


and it is clear that ¢ — 0 in this will give At. 

Remark 1. The above computation shows that the 
limit does not have to be taken one-sided, because all the 
entries in Et (t) are well-defined for |t| < op. 

Remark 2. It is very important to notice that this 
formulation starts with a constant matrix A. But the 
pseudo-inverse is not continuous, and so if we try to find 
the pseudo-inverse of a parameterized family of matrices, 
A(a) say, the family of inverses might contain gaps or 
‘holes’. 


Consider the following simple example: 


l e 
afe] T 
The Moore-Penrose inverse is, if e Æ 1, 
1 1 —e 
Tea 
talee w 


which is of course just the ordinary inverse of A. But if 
e = 1, we have 


Seah ae (18) 


4/1 1 


Notice that the limit as e goes to 1 of the general form (17) 
does not exist. This example demonstrates that 
. t . 
(lim A) # lim At. (19) 
el el 
Therefore, the Moore-Penrose inverse of a symbolic ma- 
trix will have ‘specialization’ problems—we may expect 
that a routine that computes the general form will give 
us an answer that is not necessarily correct for all values 


of the parameters. The following theorem addresses this 
issue in a constructive way, using provisos. 


Theorem 3: The following are equivalent. 


1. At(x) (where A: D C Rf Minn (R)) is continuous 
atzr=a€D. 
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2. rank(A) is constant in a neighbourhood of z = a, 
and the At given by formula (13) is correct in this 
neighbourhood. 


3. det(D(a)) # 0 in a neighbourhood of x = a, where 
A(x) = PL (a) D(«)U(«)R(«) is the Turing factoriza- 
tion of A(x). Again, the At given by formula (13) is 
correct in this neighbourhood. 


Proof: Consider the Singular Value Decomposition 
of A(x), namely A = U(«#)U(#)V*(x). The only re- 
quirement for the Tihonov formula to be correct, by 
reasoning similarly to the constant matrix case, is that 
01 > 09 >-+-+07 > 0 = Or41 = +--+ Oy. This is equivalent 
to saying that rank A(z) = r. Finally, rank(A) is con- 
stant whenever det(D) + 0 by the Turing factorization 
theorem. t 


4.1 Maple Code 


The following Maple subroutine computes the Moore- 
Penrose inverse and its proviso. 


MoorePenrose := proc(A::matrix, proviso: :name) 
local T,t, Ah; 
Ah := map(r—>evalc(conjugate(r)), 
linalg [transpose] (A)); 
linalg[LUdecomp](A, ’det’=proviso) ; 


T := evalm( Ah &* (A &* Ah + t*&*() )7(-1) ); 
map(limit, eval(T), t=0, right) 

end: 

It uses the long names linalg[transpose] and 


linalg[LUdecomp] so the routine will work even if 
linalg is not loaded in. Names are assumed to be real- 
valued by evalc, by default. The shorthand notation 
&*() is Maple’s syntax for an appropriately-sized iden- 
tity matrix. 


4.2 Examples 


It turns out to be convenient to verify that our 
computed Moore-Penrose inverse satisfies the the four 
Moore-Penrose conditions, by using the following pro- 
cedure. [It was very useful in debugging, for example.] 


MoorePenroseConditions := 
proc( A::matrix, Ap::matrix ) 
local herm; 
herm := m -> map(r->evalc(conjugate(r)), 
linalg[transpose](m)); 
print(map(normal,evalm( A &* Ap & A - A ))); 
print (map(normal,evalm( Ap&*A&*Ap - Ap))); 
print (map(normal@evalc, 
evalm( herm(A&*Ap) - A &* Ap))); 
print (map(normal@evalc, 
evalm( herm(Ap&*A) - Ap &* A))); 
end: 


The idea is that this procedure prints out four possibly 
differently-shaped zero matrices; if any of these matrices 
contains a nonzero entry, there may be a bug (but most 
probably just a weakness in zero-recognition). The fol- 
lowing Maple session explores these routines. 


> a := alpha[i] + I*alpha[l2]; 


a:=a,tlas, 


> b := beta[i] + I*betal2]; 
b := pb +I Pe 


> A := matrix(1,2,[a,b]); 


A:=|aitIaz +ib 


> M := MoorePenrose(A, ’prA’); 
Ar fag 
ay? + ay? b? Bo” 
M := 
-fı + I Bo 
Tang 2 2 2 
ay a2 By b2 


> MoorePenroseConditions(A, M); 


[o o] 
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w 
ul 


= 
ii 


prB; 


matrix(2,2,[1,e,e,1]); 


l e 


e 1 


MoorePenrose(B,’prB’); 


1 e 
-l+e? —-l1+e? 
e 1 
-l+e2 —1+e? 

Les 


MoorePenroseConditions(B,M) ; 


C := 


0 0 
0 0 
0 0 
0 0 
0 0 
0 0 
0 0 
0 0 


subs (e=1,eval(B)); 


Lo! S 
ies 


MC := MoorePenrose(C,’prC’); 


pre; 


MC := 


Ble BIR 
Ble Ble 


MoorePenroseConditions(C,MC) ; 


This yields four 4 by 4 matrices containing only zeros. 


4.3 Limitations 


This routine uses symbolic inversion (of AA* + tI) as a 
tool to compute the symbolic Moore-Penrose inverse. As 
is well-known, exact arithmetic solutions, and even more 
so exact symbolic solutions, lead very quickly to com- 
putationally intractable problems. Nonetheless if your 
problem contains only one or two parameters, and isn’t 
of too high a dimension, then efficiency and insight can 
be gained by using a symbolic inverse or Moore-Penrose 
inverse. For more discussion and examples, see [2]. 


5 Fraction-free LU factoring 


In this section we generalize the PA = LU factorization 
to the fraction-free case, which appears not to have been 
done before. The paper [9] is based on a poster presented 
at ECCAD 97, in which they gave something they called 
(in quotes) a fraction-free ‘factorization’; we modify this 
here to give a true factorization. 

However, after writing down the true factorization (20 
below), one realizes that the information in the extra fac- 
tors Fy and Fy are duplicated in the U factor, and hence 
there is no need to form them explicitly except for con- 
ceptual understanding, and thus we see that while they 
do not give a true factorization, the treatment of [9] is 
complete and practical. 

Nonetheless, while no new information is discovered 
this way, this true factorization approach has some ad- 
vantages. First, it avoids Sylvester’s Identity, and indeed 
avoids determinants altogether. It also, in our view, gives 
a clearer explanation of just why we can pull common in- 
teger factors out of certain submatrices, which is the key 
to the whole algorithm. 

We first see a theorem and then work out an example 
in detail. The proof of the theorem follows the reason- 
ing in the example and is thus omitted. Maple code for 
the fraction-free factorization, which works for matrices 
with entries from arbitrary integral domains, can be found 
in [2]. 

Theorem 4: Fraction-Free Factorization. Consider the 
rectangular matrix A € Z"*™. Then we may write 


FPA= LFU, (20) 
where F; = diag(1, p1, pip2,..-,pip2---Pn—1), P is a per- 
mutation matrix, L € Z”*” is unit lower triangular, F2 = 
diag(1, 1, Pi, Pipa; +++,)P1p2-- -Pn-2), and U E garam, 1s 
upper triangular. The p; are the pivots that arise. 


Remarks. 


e This factorization (or rather, its construction by al- 
gorithm) gives a simple proof of divisibility of the 
submatrices by pı, p2, and so on. It becomes clear 
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that since we put the factors in, with Fi, and we 
ought to be able to take them out again, with Fo. 


We may use either the one-step, or the two-step, frac- 
tion free algorithm (see [5] for details) to construct 
the factorization, which is the same in either case. 
Since the two-step method is asymptotically more ef- 
ficient than the one-step method, we should use that. 
For clarity, we do not. 


Formation of the factors in F; is not actually neces- 
sary. We may simply record the pivots pı, p2, and 
in fact even this is not necessary, since the pivots are 
already recorded in the diagonal entries of U. 


The determinants of each side are 
Pips”? + -Pn—1 det(A) 


and 


pL py? + pa—2 det(U) 


respectively. This shows that 


det(U) = pupa ---Pn—1det(A) . 


If zero pivots are encountered, or indeed if we wish 
to select the smallest pivot so as to encourage the 
least growth in the integers that arise (this heuristic 
is well-known and works very well), then we must do 
row exchanges. By the usual trick, we may pretend 
that we knew ahead of time which rows would be 
exchanged, and do them all to start with. This is 
why the P factor is next to the A factor in the state- 
ment of the theorem. The details of the percolation 
of the permutations (via column exchanges and row 
exchanges) through all the various factors are left as 
an exercise. 


The factor U is different from that of the Turing 
factorization, in that it is not unit upper triangular. 


As with the Turing factorization, once the pivoting 
strategy has been chosen then the factorization is (up 
to ties for pivots) unique. 

The integral domain Z is not special; this factoriza- 
tion works for matrices over any integral domain. 


We may combine the Fraction-Free Factorization 
with the Turing factorization to get FPA = 
LFU F3H, which gives a a factorization for the Her- 
mite normal form H. 


5.1 Example 


We use example 9.1 from [5], which has the augmented 
matrix 


3 4 —2 1 —2 

1 —1 2 2 7 
As (21) 
4 —3 4 —3 2 i 

—1 1 6 —l1 I 


In what follows we appear to temporarily allow divi- 
sions. This is a notational device only, for exposition, and 
it should be clear how to avoid ever forming any fractions 
even temporarily. 

Applying one elementary matrix step of ordinary PA = 
LU factorization to this matrix would give A = 


1 3 4 2 1 —2 
1/3 1 —7/3 8/3 5/3 = 
4/3 1 -3 2 13/3 14/3 

-1/3 1 7/3 16/3 -2/3 1/3 

(22) 


To remove the fractions, we may rewrite the identity ma- 
trix as 


1 1 
(23) 


1/3 3 


and insert these two factors in between the two factors 
of A we have found so far. Since multiplication by a 
diagonal matrix on the right multiplies columns, the 1’s 
on the diagonal of the L factor all become 1/3. Once this 
happens, we may factor 1/3 out of each row, giving 


1 1 
1/3 pa 
A = 
1/3 4 1 
1/3 zl 1 
ŞA en re 
7 8 5 2B 
x 
-25 20 Te 14 
7T 16 -2 1 


Of course, multiplying both sides by diag(1, 3,3,3) will 
remove the fractions completely. So far, we have not cap- 
tured the essence of the Bareiss-Jordan fraction-free al- 
gorithm; all we have done is cleared fractions in ordinary 
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PA = LU factorization. Indeed, this is just the begin- 
ning of what is called ‘division-free Gaussian elimination’ 
in [5]. [Apparently, ‘division-free’ is the accepted name for 
a slightly different algorithm, which is less clever than the 
‘fraction-free’ algorithm. We will not have cause to refer 
to ‘division-free’ elimination again.] We need to do one 
more step before the idea of ‘fraction-free factorization’ 
becomes clear. Call the last factor in the above equation, 
AC). We will work just with A), for easy typography. 

We start as before by pretending to use ordinary ratio- 
nal LU factorization steps. We may write AC) = 


1 oe ae | 29 
1 =f L$ 5 23 
235 1 69 _216 _ 477 
7 7 7 7 
=i 1 4 3 24 


(24) 
and again we will wish to clear the fractions (accidentally, 
the last multiplier was also 7 and so the entries in the last 
row are integers, but in general this will not happen). In 
actual fact the pivot was —7, so we'll adjust the minus 
signs above, and use a similar rewriting of the identity 
(this time as diag(1,1,—1/7,—-1/7)diag(1, 1,7, —7)) to 


arrive at 


1 1 
1 1 
AQ) = 

—1/7 -25 1 
—1/7 T 1 

3 4 —2 1 —2 

-7 8 5 23 

X 

60 216 477 
—168 —21 —168 


The percolation of the diag(1,1,—1/7,—1/7) factor 
through the left-hand factors already obtained is left as 
What concerns us now is the fact that ev- 
ery element of the remaining non-triangular submatrix, 
namely 


an exercise. 


60 216 
—168 —21 


is divisible by 3. 

A moment’s thought shows why this must be so. We 
introduced the factor 3 into the entire submatrix when 
we multiplied by diag(1,3,3,3). It was needed for the 
elimination of the first column, but is not needed here. 
We may now factor it out, to keep the size of the elements 
down. 

It is important to note that we will not need to do GCD 
calculations to discover this divisibility; we will know 


ATT 


: 25 
—168 a 


ahead of time that this divisibility will happen. Thus 
we may cheaply take advantage of it. 


Writing this observation as a factorization, we have 


that diag(1,1,-7, -7)A™ 


1 1 
= -25 1 3 

7 1 3 
3 4 -2 1 2 
-7 8 5 28 

x 

20 72 159 
=s 27 86 


Continuing with one more step, and rearranging all the 
factors that we have so far found, we get at last that 


1 1 
3 1 1 
—21 =, —28 —25 1 
—420 140 140 -56 1 
[ 1 ] [ 3 4 -2 1 —2 ] 
x 1 -7 8 5 23 
3 20 72 159 : 
—21 —556 —1112 


6 Concluding Remarks 


Recent releases of Maple offer a command LUdecomp that 
returns the generalized LU factoring described here. The 
proviso det(U,) 4 0 allows the user (at least in principle) 
to identify the values of the parameters that give excep- 
tional row echelon forms, and the program can be called 
again with these special values of the parameters. 


This idea of a row echelon decomposition may be useful 
numerically, as well, if we make the usual shift to looking 
at the condition number of U; instead of the determinant. 
The reader will be interested to note that the idea of 
condition number, and the definition of ‘ill-conditioned’, 
was also introduced by Turing, in the same paper [16]. 
Thus we hope that the name “Turing factorization’ for 
the result of Theorem 1 is accepted. 


Further, the Turing factorization can be used to solve 
all the major computational problems encountered in a 
first linear algebra course, such as solution of linear equa- 
tions, positive definiteness of quadratic forms, Gram- 
Schmidt orthogonalization (see [14]), and eigenvalues. 
This elegant unity is surely a consequence of Turing’s 
mathematical taste; even in the simple fields, he was at- 
tracted to the key ideas. 
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